Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors

Author

  • Peter Zaspel
Abstract

In this work, we consider the reformulation of hierarchical (H) matrix algorithms for many-core processors, with a model implementation on graphics processing units (GPUs). H matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of H matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing H matrix CPU implementations with many-core processors, we here aim to rely entirely on that processor type. As the main contribution, we introduce the necessary parallel algorithmic patterns that allow the full H matrix construction and the fast matrix-vector product to be mapped to many-core hardware. Here, crucial ingredients are space-filling curves, parallel tree traversal, and batching of linear algebra operations. The resulting model GPU implementation, hmglib, is, to the best of the author's knowledge, the first entirely GPU-based open-source H matrix library of this kind. We conclude this work with an in-depth performance analysis and a comparative performance study against a standard H matrix library, highlighting profound speedups of our many-core parallel approach.


Similar resources


Ultra-Low-Energy DSP Processor Design for Many-Core Parallel Applications

Background and Objectives: Digital signal processors are widely used in energy-constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. In this work, as a first step, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...


Operator frame for $End_{\mathcal{A}}^{\ast}(\mathcal{H})$

Frames generalize orthonormal bases and allow representation of all the elements of the space. Frames play significant role in signal and image processing, which leads to many applications in informatics, engineering, medicine, and probability. In this paper, we introduce the concepts of operator frame for the space $End_{\mathcal{A}}^{\ast}(\mathcal{H})$ of all adjointable operator...


Structured Parallel Programming with Deterministic Patterns

Many-core processors target improved computational performance by making available various forms of architectural parallelism, including but not limited to multiple cores and vector instructions. However, approaches to parallel programming that target these low-level parallel mechanisms directly lead to overly complex, non-portable, and often unscalable and unreliable code. A more struc...


Non-additive Lie centralizer of infinite strictly upper triangular matrices

Let $\mathcal{F}$ be a field of zero characteristic and $N_{\infty}(\mathcal{F})$ be the algebra of infinite strictly upper triangular matrices with entries in $\mathcal{F}$, and $f:N_{\infty}(\mathcal{F})\rightarrow N_{\infty}(\mathcal{F})$ be a non-additive Lie centralizer of $N_{\infty}(\mathcal{F})$; that is, a map satisfying that $f([X,Y])=[f(X),Y]$ for all $X,Y\in N_{\infty}(\mathcal{F})...




Publication date: 2017